Bioinformatics (Thomas Dandekar, Meik Kunz)

20.11 Design Principles of a Cell

Questions 11.1 to 11.7

Transfer RNAs (tRNAs) mediate the translation of the RNA code into the correct amino acids; this happens at the ribosomes. The protein structure is determined by biophysical laws (e.g. hydrogen bonds, hydrophobic interactions), but also by other effects such as crowding. However, these are so complex that the exact process by which the three-dimensional protein structure forms has not yet been completely deciphered (e.g. via the “molten globule” state).
However, since many protein sequences and protein domains are known, much information about function and structure can be obtained from databases. For example, resolved three-dimensional structural coordinates together with annotations for the protein can be found in the PDB (https://www.rcsb.org/pdb/home/home.do) and UniProt (https://www.uniprot.org/) databases. In addition, there are classification databases, for example by sequence and structural similarity, such as SCOP (structural classification of proteins; https://scop.mrc-lmb.cam.ac.uk/scop/, continued from 2010 as SCOP extended; https://scop.berkeley.edu) and CATH (classification by class, architecture, topology and homology; https://www.cathdb.info/), or by protein family and function, such as PROSITE (https://prosite.expasy.org/) and Pfam (https://pfam.xfam.org/). Thus, it is possible to obtain predictions of protein structure and function through experiments and bioinformatic modelling (e.g. differential equations and simulations). In this context, there are different approaches to predicting protein
structure from a sequence, e.g. ab initio and comparative predictions (e.g. homology modeling, threading). Ab initio predictions are based on the biophysical properties of proteins, whereas homology modeling uses known protein structures as templates. There are many useful software tools to visualize (e.g., hydrogen bonds or hydrophobic regions) and analyze (e.g., docking and modeling) protein structures, such as PyMOL (https://www.pymol.org/), RasMol
(https://www.openrasmol.org/), and Swiss-PdbViewer (https://spdbv.vital-it.ch/). A protein structure analysis can be performed bioinformatically, e.g. with AnDom (contains three-dimensional structural domains based on the SCOP classification), SWISS-MODEL (https://swissmodel.expasy.org/), I-TASSER (Iterative Threading ASSEmbly Refinement; https://zhanglab.ccmb.med.umich.edu/I-TASSER/) or with a Ramachandran plot, which provides information about possible structures, domains and function. A Ramachandran plot (e.g., RAMPAGE software; https://mordred.bioc.cam.ac.uk/~rapper/rampage.php) displays the phi and psi torsion angles of the protein backbone, thus providing a graphical overview of the distribution of alpha helices and beta sheets.
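
The phi and psi angles shown in a Ramachandran plot can be computed directly from backbone atom coordinates. The following is a minimal sketch, not the method used by RAMPAGE or any other tool named above. The function names `dihedral` and `phi_psi` and the input layout (one dictionary per residue with `N`, `CA` and `C` coordinates) are assumptions made for illustration. It uses the standard definitions: phi is the torsion angle C(i-1)-N(i)-CA(i)-C(i), and psi is the torsion angle N(i)-CA(i)-C(i)-N(i+1).

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the bond p1-p2."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def phi_psi(backbone):
    """Return one (phi, psi) tuple per residue.

    `backbone` is a hypothetical input layout: a list of dicts with the
    keys "N", "CA" and "C", each holding an (x, y, z) coordinate tuple.
    phi is None for the first residue and psi is None for the last one,
    because the required neighbouring atom does not exist there.
    """
    angles = []
    for i, res in enumerate(backbone):
        phi = psi = None
        if i > 0:
            phi = dihedral(backbone[i - 1]["C"], res["N"], res["CA"], res["C"])
        if i < len(backbone) - 1:
            psi = dihedral(res["N"], res["CA"], res["C"], backbone[i + 1]["N"])
        angles.append((phi, psi))
    return angles
```

In practice the coordinates would be read from a PDB entry, and each (phi, psi) pair would be drawn as one point in the plot. Residues in alpha helices typically cluster around phi of about -60 and psi of about -45 degrees, while residues in beta sheets lie around phi of about -120 and psi of about +130 degrees.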

Questions 11.8 to 11.11

I can find a possible function for a protein if I search its sequence for sequence motifs and protein domains, i.e. independent folding units. This shows me, for example, whether an active site, a regulatory domain or interaction domains are present in my

20 Solutions to the Exercises